
26 research outputs found

Using the Fisher Vector Approach for Cold Identification

Author: Gosztolya Gábor
José Vicente Egas López
Publication venue: University of Szeged
Publication date: 01/01/2021

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Using the Fisher Vector Approach for Cold Identification

Author: Egas-López José Vicente
Gosztolya Gábor
Publication venue: University of Szeged, Institute of Informatics
Publication date: 01/01/2021

In this paper, we present a computational paralinguistic method for assessing from speech whether a person has an upper respiratory tract infection (i.e., a cold). A system that can accurately detect a cold could help predict how it spreads. For this purpose, we use Mel-frequency cepstral coefficients (MFCCs) extracted from the utterances as audio-signal representations and fit a generative Gaussian Mixture Model (GMM) on them, which is then used to produce an encoding based on the Fisher Vector (FV) approach. We use the URTIC dataset provided by the organizers of the ComParE Challenge 2017 of the Interspeech conference. Classification is performed by a linear-kernel Support Vector Machine (SVM); owing to the high class imbalance in the training set, we undersample the majority class, that is, we reduce its number of samples to that of the minority class. We find that applying Power Normalization (PN) and Principal Component Analysis (PCA) to the Fisher vector features is an effective strategy for improving classification performance. We obtain better performance than the Bag-of-Audio-Words approach reported in the challenge paper.
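
Two of the steps named in this abstract, undersampling the majority class and Power Normalization, can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the exponent alpha = 0.5, the fixed random seed, and the toy labels are all assumptions, since the abstract does not state the settings that were used.

```python
import math
import random


def power_normalize(vec, alpha=0.5):
    """Signed power normalization: sign(x) * |x|**alpha for each component.

    alpha = 0.5 is an assumed, commonly used value, not one from the abstract.
    """
    return [math.copysign(abs(x) ** alpha, x) for x in vec]


def undersample(features, labels, seed=0):
    """Randomly keep, for every class, as many samples as the rarest class has."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(items) for items in by_class.values())
    kept_x, kept_y = [], []
    for y, items in by_class.items():
        for x in rng.sample(items, n_min):
            kept_x.append(x)
            kept_y.append(y)
    return kept_x, kept_y


# Toy usage with made-up one-dimensional "feature vectors" and labels.
toy_x = [[float(i)] for i in range(10)]
toy_y = ["healthy"] * 8 + ["cold"] * 2
balanced_x, balanced_y = undersample(toy_x, toy_y)
normalized = [power_normalize(v) for v in balanced_x]
```

In a full pipeline the balanced, normalized vectors would then go through PCA and into the linear SVM; those steps are left out here because they depend on library choices the abstract does not specify.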

University of Szeged

Adaptation of Speaker and Speech Recognition Methods for the Automatic Screening of Speech Disorders using Machine Learning

Author: Egas López José Vicente

This PhD thesis presents methods for exploiting the non-verbal communication of individuals suffering from specific diseases or health conditions, with the aim of screening for them automatically. More specifically, we employ one of the pillars of non-verbal communication, paralanguage, to explore techniques that can be used to model the speech of subjects. Paralanguage is a non-lexical component of communication that relies on intonation, pitch, speaking rate, and other cues, all of which can be processed and analyzed automatically. This field is called Computational Paralinguistics, which can be defined as the study of modeling non-verbal latent patterns in a speaker's speech by means of computational algorithms; these patterns go beyond the linguistic approach. Using machine learning, we present models from distinct scenarios of both paralinguistics and pathological speech that can automatically estimate a subject's health status with respect to conditions such as Alzheimer's disease, Parkinson's disease, and clinical depression, among others.

SZTE Doktori Értekezések Repozitórium (SZTE Repository of Dissertations)

Assessing Parkinson’s Disease from Speech Using Fisher Vectors

Author: Egas López José Vicente
Gosztolya Gábor
Orozco-Arroyave Juan Rafael
Publication venue: International Speech Communication Association
Publication date: 01/01/2019

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Repository of the Academy's Library

Identifying Conflict Escalation and Primates by Using Ensemble X-Vectors and Fisher Vector Features

Author: Gosztolya Gábor
José Vicente Egas López
Kiss-Vetráb Mercedes
Tóth László
Publication venue: International Speech Communication Association
Publication date: 01/01/2021

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Identifying Conflict Escalation and Primates by Using Ensemble X-Vectors and Fisher Vector Features

Author: Egas López José Vicente
Gosztolya Gábor
Kiss-Vetráb Mercedes
Tóth László
Publication venue: International Speech Communication Association
Publication date: 01/01/2021

Repository of the Academy's Library

Recognizing Multiple Sclerosis from Spontaneous Speech Using Features Extracted from wav2vec 2.0 Models

Author: Bóna Judit
Egas-López José Vicente
Gosztolya Gábor
Hoffmann Ildikó
Svindt Veronika
Publication date: 01/01/2023

Multiple sclerosis (MS) is a chronic inflammatory disease of the central nervous system. Since MS also affects the subjects' speech, among other things, automatic speech analysis may offer a simple, relatively inexpensive, and contact-free (remote) way of detecting changes in speech production. When developing such an automatic analysis procedure, however, the choice of features extracted from the speech may prove critical. In this paper we compute features using ten wav2vec 2.0 models, and we compare the resulting classification results with those obtained using x-vector neural networks: public ones trained on a large amount of data, and ones we trained ourselves on a smaller amount of Hungarian-language data. In our experiments, the wav2vec 2.0 models trained on a multilingual phonetic inventory proved more effective than the base models. The most important attribute, however, appears to be the number of model parameters: the best result was achieved by the model with one billion trainable parameters. In addition, we found that fine-tuning the model on the target language (Hungarian in our case) improves the results, whereas, at least according to our experimental results, fine-tuning on another language is not worthwhile. Surprisingly, however, we could not surpass the performance of the x-vectors, which in our opinion is probably due to the conventional but perhaps overly simple recording-level aggregation of the frame-level embeddings.

University of Szeged

Automatic Recognition of Mild Cognitive Impairment Using a Sequential Autoencoder

Author: Balogh Réka
Egas-López José Vicente
Hoffmann Ildikó
Imre Nóra
Kálmán János
Pákáski Magdolna
Tóth László
Vetráb Mercedes
Publication date: 01/01/2022

Mild cognitive impairment (MCI) is a heterogeneous clinical syndrome. Its main symptoms include a decline in memory, thinking, reasoning, and language abilities, which nevertheless does not significantly disrupt patients' everyday lives. Because the decline is mild and the symptoms are latent, diagnosing MCI very often runs into difficulties. In this study we use a sequential autoencoder for feature extraction in order to obtain robust and effective attributes. The database used contains audio recordings of 25 subjects with MCI and 25 healthy controls. According to our results, this approach delivers competitive performance: it can yield better results even than an x-vector network trained on a larger database. In further experiments we also attempted to distinguish subjects suffering from mild Alzheimer's disease (mAD).

University of Szeged

Cross-lingual detection of mild cognitive impairment based on temporal parameters of spontaneous speech

Author: Balogh Réka
Egas López José Vicente
Gosztolya Gábor
Hoffmann Ildikó
Imre Nóra
Kálmán János
Pákáski Magdolna
Tóth László
Vincze Veronika
Publication venue: Elsevier BV
Publication date: 01/01/2021

Repository of the Academy's Library